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Abstract 

In recent years genome-wide association studies (GWAS) have uncovered numerous chromosomal loci associated with 
various electrocardiographic traits and cardiac arrhythmia predisposition. A considerable fraction of these loci lie within 
inter-genic regions. The underlying trait-associated variants likely reside in regulatory regions and exert their effect by 
modulating gene expression. Hence, the key to unraveling the molecular mechanisms underlying these cardiac traits is to 
interrogate variants for association with differential transcript abundance by expression quantitative trait locus (eQTL) 
analysis. In this study we conducted an eQTL analysis of human heart. For a total of 129 left ventricular samples that were 
collected from non-diseased human donor hearts, genome-wide transcript abundance and genotyping was determined 
using microarrays. Each of the 18,402 transcripts and 897,683 SNP genotypes that remained after pre-processing and 
stringent quality control were tested for eQTL effects. We identified 771 eQTLs, regulating 429 unique transcripts. 
Overlaying these eQTLs with cardiac GWAS loci identified novel candidates for studies aimed at elucidating the functional 
and transcriptional impact of these loci. Thus, this work provides for the first time a comprehensive eQTL map of human 
heart: a powerful and unique resource that enables systems genetics approaches for the study of cardiac traits. 
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Introduction 

It is well established that many cardiac traits and susceptibility 
to heart disease are heritable [1,2,3,4,5,6,7]. Several genome-wide 
association studies (GWAS) have uncovered common genetic 
variation, in the form of single nucleotide polymorphisms (SNPs), 
impacting on cardiac traits such as susceptibility to atrial 
fibrillation [8], ventricular fibrillation [9], heart rate [10] and 
electrocardiographic (ECG) indices of cardiac conduction 
[11,12,13,14] and repolarization [15,16]. There is widespread 
consensus that functional studies of GWAS-defined loci will 
advance our understanding of the molecular underpinnings of the 
associated traits. 

SNPs identified by GWAS are considered to impact the 
respective clinical phenotype, either direcdy or indirectly by virtue 
of linkage disequilibrium (LD) with the causal variant(s) in the 
context of a haplotype. Many trait-associated haplotypes occur in 



non-coding regions of the genome [17] and are hypothesized to 
modulate the respective trait through effects on gene expression 
[18]. Such SNPs are particularly challenging to understand 
because they may exert effects on the trait either by affecting the 
expression of a neighbouring gene (cir-effect) or the expression of a 
gene located elsewhere in the genome (franj-effects). One way of 
understanding GWAS signals thus entails interrogating trait- 
associated variants for association with differential transcript 
abundance by expression quantitative trait locus (eQTL) analysis. 
Studying gene expression level effects of disease-associated 
haplotypes has successfully uncovered the molecular mechanisms 
underlying loci associated with increased risk of myocardial 
infarction [19], coronary artery disease [20] and colorectal cancer 
[21]. In recent years, multiple genome-wide eQTL resources have 
become available for various tissues including brain, liver and 
adipose tissue [22,23,24,25,26,27,28,29]. Because eQTLs may be 
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Figure 1. Overview plots for top cis eQTLs. An overview of the 4 most significant c/'s eQTLs: rsl 1 150882 with C17orf97 (panel A), rs1 1 158569 with 
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tissue-specific, a similar resource for human heart is anticipated to 
have great value [23,29,30,31]. 

To this end, we have generated a human heart eQTL resource 
by genome-wide genotyping and determination of transcript 
abundance in 129 human donor heart samples. We subsequently 
overlaid previously identified cardiac trait GWAS signals with the 
identified eQTLs to identify candidate causal genes for the effects 
at these GWAS loci. This work provides an eQTL map of human 
heart, a resource that is likely to play an important role in 
furthering our understanding of the mechanisms associated with 
loci identified in GWAS on cardiac traits. 

Results 

General design of study 

We collected left ventricular samples from 180 non-diseased 
human hearts of unrelated organ donors whose hearts were 
explanted to obtain pulmonary and aortic valves for transplant 
surgery or explanted for heart transplantation but not used due to 
logistical reasons (e.g. no tissue-matched recipient was available). 
The subjects were assumed to be mainly of Western European 
descent. mRNA and DNA were isolated according to standard 
procedures. Transcript abundance was measured using the 
HumanHT-12 v4.0 whole genome array (Illumina) and genotyp- 
ing was carried out using the HumanOmniExpress genome-wide 
SNP arrays (Illumina). 

Data preprocessing and normalization 

Gene transcript abundance: Of the 47,231 transcripts whose 
expression levels were measured on the array, only those that were 
expressed above background level and for which the probe 
sequence mapped unambiguously to the genome and did not 



contain common SNPs, were used in further analyses. This 
procedure left 18,402 transcripts for eQTL analysis. Model-based 
background correction and normalization across arrays and 
transcripts was performed to correct for technical variance present 
in gene expression levels. A total of 162 arrays passed the 
standardized microarray gene expression quality control. 

Genotyping: Manhattan distance clustering and principal 
component analysis of the genotype data of 154 samples that 
were successfully genotyped, revealed 1 3 genetic outliers (Figure 
SI). To ensure a genetically homogenous group for further 
analysis, samples pertaining to these clusters were removed. An 
additional 12 samples were removed due to low call rate (<95%), 
high proportion of alleles identical-by-state (>95%), or extreme 
heterozygosity (FDR 1%). Only SNPs with a minor allele 
frequency (MAF) higher than 0.15 were considered in eQTL 
analysis. This cutoff was chosen to ensure sufficient power to 
detect eQTLs within a broad range of effect sizes (Figure S2). 
Imputation was performed using the HAPMAP Phase III data (see 
Materials & Methods for details). This left 129 samples (74 male, 
55 female; age 41 ±14), 18,402 transcripts and 897,683 SNPs for 
eQTL analysis. 

Genome-wide eQTL mapping 

Each of the measured transcripts was tested for association with 
all SNPs using linear modeling, taking age, sex and clinical/ 
university center as covariates. We thus identified 6402 significant 
eQTLs (FDR <0.05). To remove redundant signals and identify 
independent expression-controlling loci, we performed linkage- 
disequilibrium (LD)-pruning. For this we grouped SNPs exhibiting 
LD (r >0.6) into clusters, revealing 771 independent loci 
regulating 429 unique transcripts. These results are comparable 
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to eQTL studies in other non-diseased tissues of similar sample 
size [22,23,24,28,29]. 

Of these 771 eQTLs, 770 were or-eQTLs for 428 unique 
transcripts (p<2.82 x 10" 5 ; FDR <Q.05), where the associated 
SNPs lie within 1 Mb of the transcriptional start site (TSS) of the 
cognate transcript. For the four most significant ar-eQTLs, box- 
and-whisker plots and mean-standard-error plots for the individual 
genotypes are given in Figure 1. An overview of the most 
significant cw-eQTLs is given in Table 1 and the complete results 
are given in supplemental Table SI. 

Of the independent significant eQTLs, one was found to be in 
trans (p<2. 12 x 10" 1 FDR <0.05), with the expression of 
LOC644936 located on chromosome 5 being seemingly modulated 
by an eQTL (rs852423) on chromosome 7. However, as 
LOC644936 is a known pseudogene of ACTB and rs852423 is 
located within ACTB, we cannot rule out the possibility that 
rs852423 is in fact a as eQTL for ACTB rather than a trans eQTL 
for LOC644936. Using BLAST to align the microarray probe 
sequence of LOC644936 to the human transcriptome uncovered a 
partial match with ACTB in addition to a 100% match with 
LOC644936. 

Integration of eQTL data with cardiac GWAS loci 

In order to provide candidate genes for the reported heart- 
related GWAS loci, we listed the 102 SNPs previously associated 
with a cardiac trait at genome-wide statistical significance (Pg^ ^ 
5x10 8 ), representing 74 independent loci (LD-pruned with r > 
0.6, see Materials & Methods). These corresponded to loci 
associated with ventricular fibrillation/ sudden cardiac death, atrial 
fibrillation, heart rate, PR interval, QRS duration and QTc 
interval. Of these, the 64 SNPs that displayed a MAF of 15% or 
higher in the eQTL sample were overlaid with the eQTL data to 
identify transcripts under genetic regulation by these loci. All 
GWAS SNPs were tested for association with transcript levels of all 
18,402 transcripts in this study. We identified a as association 
between rs99 12468, a modulator of QRS duration [12] with the 
level of expression of the PRKCA transcript at genome-wide 
statistical significance (p = 2.90 x 10~ 9 , see Figure 2A). Besides 
PRKCA, no other GWAS SNP displayed an eQTL association p- 
value that passed the stringent Bonferroni-corrected p-value 
threshold (p<0.05/64 SNPs xl8,402 transcripts ~ 4xlO" H ). A 
total of 34 SNPs were associated with the transcript level of a gene 
at a p<0.05 (Table 2). Among these, rs8049607, a modulator of 
QTc-interval [16] was found to be associated in cis with the 
transcript level of LITAF (p<5 x 10" 4 , Figure 2C), and rs7612445 
and rs6882776, both associated with heart rate [10] were 
associated in cis with the transcript levels of GNB4 (p<2xl0 4 , 
Figure 2B) and MKX2-5 (p<6xl0~ 3 , Figure 2D), respectively. 
The number of nominal associations for the 64 cardiac trait- 
associated SNPs tested represents a more than 7-fold enrichment 
(p<0.05, see Materials & Methods) compared to a random 
selection of 64 variants from the entire set of SNPs used in eQTL 
analysis. 

Discussion 

We conducted a genome-wide eQTL analysis in 1 29 samples of 
normal human myocardium, identifying genetic variation regu- 
lating gene expression in human heart and uncovering 771 
genome-wide significant independent eQTLs. This resource, 
heretofore unavailable in human heart will contribute to 
advancing our understanding of the genetic mechanisms under- 
lying loci associated with cardiac traits. All but one of the eQTLs 
identified were as eQTLs. Other eQTL studies have identified 
only few trans eQTLs [22,24,28,29], illustrating the general 
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difficulty of detecting toznj-regulatory variants in eQTL studies 
[31,32]. Based on larger eQTL studies in other tissues 
[22,24,25,26,29] as many as 4000 independent cardiac cis eQTLs 
are expected to be present, hence the results presented here are a 
subset of this theoretical complete set of cardiac eQTLs. 

In recent years, many novel loci associated with a number of 
cardiac traits, including cardiac arrhythmia and ECG indices, 
have been discovered. However, the identification of (novel) genes 
at these loci has lagged behind. The availability of a cardiac eQTL 
resource is likely to aid in the dissection of these loci by providing a 
means of prioritizing candidate genes for follow-up functional 
studies. Indeed, our current findings already provide candidate 
genes for a number of these loci (Table 2). One such example is 
the PRKCA gene for the effect observed on QRS duration for the 
rs9912468-tagged haplotype on chromosome 9. PRKCA encodes 
protein kinase C alpha, a fundamental regulator of cardiac 
contractility and Ca z+ handling in cardiomyocytes [33]. The 
mechanism by which it regulates QRS duration is unknown. 
Other candidates include the LITAF gene (encoding lipopolysac- 
charide-induced TNF factor) for the rs804960 7 -tagged haplotype 
associated with QTc-interval and the GMB4 gene (encoding 
guanine nucleotide binding protein) for the rs7612445-tagged 
haplotype associated with heart rate. None of these eQTLs (for 
PRKCA, LITAF and GJVB4) have been previously identified in non- 
cardiac tissues. 

The utility of this approach is further evidenced by the fact that 
the 64 GWAS SNPs were enriched in nominally significant eSNPs 
as compared to a random selection of 64 variants from the entire 
set of SNPs used in eQTL analysis. Such an enrichment was 
reported before for GWAS loci in general based on eQTLs 
identified in lymphoblastoid cell lines from HAPMAP samples 
[18]. 



The eQTLs we identified represent an enriched set of highly 
relevant candidates to test in future studies for association with 
cardiac traits and disease. Among the highly significant eQTLs 
listed in Table 1, at least two SNPs could also be interesting from 
a pharmacogenetic point of view. One is rs 1222809 which was 
found to be strongly associated with the expression level of the 
DHFR gene encoding dihydrofolate reductase, a putative target of 
the drug methotrexate. Of note previous studies have provided 
evidence that rs 1650697, which is in complete LD with rs 1222809, 
may be associated with adverse events to methotrexate in patients 
with rheumatoid arthritis [34,35]. The other potentially interesting 
eQTL from a pharmacogenetic point of view is rs4822466 which 
was found to be highly associated with the expression of GSTT1, a 
gene encoding the liver detoxifying enzyme Glutathione S- 
transferase Tl. 

The eQTLs we identified are expected to be enriched in the 
regulatory regions of the genome such as promoter regions, 
enhancers and transcription factor binding sites [36]. Recent work 
has begun to uncover these relationships for adult human heart 
[37]. However, formal testing for enrichment of eQTLs in the 
known regulatory regions [37] did not provide statistically 
significant enrichment (data not shown). At least in part, this 
may be due to the limited number of eQTLs we have identified. 

A limitation of the presented study concerns the fact that not all 
transcripts have been tested for eQTL effects. Transcripts that 
were expressed below the (array-based) detection level or for which 
probe design was not optimal could not be tested. Conversely, not 
all haplotypes in the genome were tested as for instance we only 
tested SNPs with a MAF higher than 0.15. Furthermore, our 
sample size and therefore statistical power was limited, preventing 
the identification of eQTLs of smaller effect and trans eQTLs. The 
interpretation of the data concerning SNPs from GWAS presented 
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in Table 2 must take these considerations into account. 
Additionally, the single trans eQTL we identified is likely a false 
discovery and will require further investigation. 

Our study was conducted in left ventricular myocardium. 
However, it is well known that different cardiac compartments 
such as the atria or the specialized conduction system display 
different gene expression patterns [38,39,40,41] and eQTL effects 
might thus differ across cardiac compartments. Furthermore, we 
have no information relating to cardiac traits such as ECG indices 
in the 129 individuals from whom the left ventricular samples were 
obtained; we were therefore unable to correlate gene expression 
with cardiac traits in these individuals [23,42]. 

In summary, we here provide the first eQTL map of human left 
ventricular myocardium that will enable systems genetics ap- 
proaches in the study of cardiac traits. 

Materials and Methods 

Ethics statement 

Investigations using the human ventricular samples conformed 
to the principles outlined in the Helsinki Declaration of the World 
Medical Association. The ethical review boards of University of 
Szeged (Ethical Review Board of the University of Szeged Medical 
Center; Szeged, Hungary), Vanderbilt University (Institutional 
Review Board of Vanderbilt University School of Medicine; 
Nashville, USA), University of Miami (Institutional Review Board 
of the University of Miami School of Medicine; Miami, USA), and 
the University of Sydney (Human Research Ethics Committee 
(HREC); Sydney, Australia) approved procurement and handling 
of the human cardiac material. Written informed consent from the 
donor or the next of kin was obtained for use of this sample in 
research. All data was analyzed anonymously. 

Sample collection 

Left ventricular samples were obtained from 180 non-diseased 
human hearts of unrelated organ donors whose hearts were 
explanted to obtain pulmonary and aortic valves for transplant or 
valve replacement surgery or explanted for transplantation but not 
used due to logistical reasons. The tissues were ascertained at the 
University of Szeged (Hungary; n = 79), Vanderbilt University 
(NashviUe, USA; n = 46), University of Miami (USA; n = 30), and 
the University of Sydney (Australia; n = 25) and assumed to consist 
mainly of subjects of Western European descent based on self- 
reported ethnicity. The Vanderbilt samples were procured with 
the assistance of the National Disease Research Interchange 
(Philadelphia, PA). 

Generation and processing of gene expression data 

Total RNA was extracted from the human left ventricular heart 
samples using the mzVVana miRNA isolation kit (Ambion) at the 
AMC, Amsterdam, The Netherlands. Sample processing order 
was randomized. RNA quality was assessed by Agilent Bioanalyzer 
(minimum RIN = 7) and spectrophotometry (minimum 
260 nm:280 nm= 1.8). The IUumina TotalPrep-96 RNA Ampli- 
fication Kit was used to generate cRNA starting from 200 ng total 
RNA. Genome-wide gene expression data was generated using 
Illumina HumanHT-12 v4 BeadArrays, containing 47,231 probes 
representing 28,688 RefSeq annotated transcripts (ServiceXS, 
Leiden, The Netherlands), following the instructions of the 
manufacturer. 

Raw expression data were imported into the Illumina Bead- 
Studio and summarized at probe-level for each sample without 
normalization or background correction. The summarized data 
were subsequently imported into R (version 2.15.3) [43] using the 
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beadarray package [44]. Quality control was performed using the 
ArrayQualityMetrics package in R [45]. Samples displaying 
transcriptional stratification using hierarchical clustering were 
omitted from the analysis. The summarized data of the 162 
remaining samples was background corrected and quantile 
normalized using the neqc algorithm [46] across all samples. The 
neqc algorithm is the current standard data-preprocessing method 
for Illumina gene expression BeadArrays [47], and has been 
applied in eQTL studies with comparable sample size [29,30]. 

Probes containing common SNPs (HAPMAP Phase III release 
2) [27,29] and probes whose sequence did not align or aligned 
ambiguously to the human reference genome (HG19), according 
to up-to-date Illumina HumanHT-12 v4.0 BeadArray annotation 
available from the Bioconductor project, were left out of the 
analysis. Additionally, probes with median expression levels below 
a study specific threshold (the median expression levels of Y 
chromosome transcripts in the female subjects of the sample 
population) were not considered for subsequent analyses. 

Genotyping and genotype imputation 

DNA was extracted for genotyping from 162 heart samples that 
passed the gene expression analysis quality control criteria (see 
above) at the AMC, Amsterdam, The Netherlands. Genome-wide 
SNP genotyping was carried out using Illumina HumanOmniEx- 
press Beadchips interrogating 733,202 genetic markers (Genome 
Analysis Center, Helmholtz Zentrum Mtinchen, Germany). A 
total of 8 samples had sample quality issues (and were not 
hybridized) or failed hybridization, leaving genotype data for 154 
samples. Quality control was performed in the GenABEL [48] 
package in R using default settings. Samples with low call rate (< 
95%), extreme heterozygosity (FDR 1%) or high proportion of 
alleles identical-by-state (>95%) were removed. Additionally, any 
remaining samples showing genetic stratification through Man- 
hattan distance hierarchical clustering (using the popgen [49] 
package in R), and confirmed with principal component analysis 
[48], were not considered (Figure SI). 

Power calculations were performed (with a fixed FDR of 0.05) 
to assess the influence of MAF on power in relation to observed 
gene expression fold changes. Based on these results, a MAF 
threshold of 0.15 was chosen to ensure sufficient power to detect cis 
eQTLs within a broad range of effect sizes (Figure S2). 
Additionally, assuming Hardy-Weinberg equilibrium, a MAF of 
0.15 or higher yields an expected number of three individuals 
homozygous for the minor allele, which we considered the 
minimum for fitting a meaningful additive genetic model. 

Imputation was performed using the MACH software [50] and 
the HAPMAP Phase III data. Only SNPs imputed with sufficient 
confidence were considered, using the estimate of the squared 
correlation between imputed and true genotypes. By setting the 
cut-off at 0.30, most of the poorly imputed SNPs are filtered out, 
compared to only a small number (< 1 %) of well imputed SNPs 
[51]. 

eQTL statistical analysis 

After pre-processing and stringent quality control of gene 
expression and genotypic data as described above, a total of 129 
heart samples were used in eQTL analysis. Each transcript was 
tested for association with SNP genotypes genome-wide using 
linear modeling (assuming an additive genetic model), taking age, 
gender and tissue collection center as covariates, using the 
GenABEL package [48] in R. Correction for multiple testing was 
performed on the complete set of cis eQTL p-values in the qvalue 
package in R [52]. A q-value (FDR) £0.05 was considered 
significant for cis eQTLs, corresponding to a p-value of 



2.82x10 5 . Cis relations were defined as those within 1 Mb of a 
transcription start site (TSS), in accordance with previous reports 
demonstrating that over 90% of cis SNPs are situated within 
100 Kb of a TSS [26,27,29,47,53]. SNPs with an LD R 2 of larger 
than 0.6 were considered dependent and LD-pruned into clusters 
(LD clusters), in accordance with previous studies [23,29,30]. For 
trans eQTLs, only results with a p-value <5xl0 _!! were 
considered (corresponding to a target ot (or p value) of 0.05 with 
a Bonferroni correction for 1 million independent tests [54,55]). 
Correction for multiple testing was done by using a step-up 
Benjamini & Hochberg procedure on all p-values <5xl0 8 , and 
a q-value (FDR) £0. 05 was considered genome-wide significant 
for trans eQTLs, corresponding to a p-value of 2.12x10 11 . 

eQTL biological interpretation and candidate gene 
prioritization 

To prioritize candidate genes for further studies, additional data 
sources were integrated. Additional trait and disease associated 
SNPs were extracted from PubMed (www.ncbi.nlm.nih.gov/ 
pubmed; search terms: 'GWAS' AND 'cardiac', 'atrial fibrillation', 
'sudden cardiac death', 'ECG [electrocardiographic]', 'PR inter- 
val', 'QRS', 'QT, 'repolarization'), the NHGRI catalog of 
published GWAS (http://www.genome.gov/gwastudies/), and 
GWAS central (https:/ /www.gwascentral.org) on January 8, 
2013. Analyses were restricted to samples of European ancestry. 
Results were classified into six categories: sudden cardiac death, 
atrial fibrillation, heart rate, PR duration, QRS duration and QTc 
duration. Next, each GWAS SNP passing genome-wide signifi- 
cance in the respective study (5xl0 _!! , a target a of 0.05 with a 
Bonferroni correction for 1 million independent tests) was tested 
for association with expression of all 18,402 measured transcripts. 
To determine the number of independent loci, LD-pruning was 
performed by merging all GWAS SNPs with LD r 2 >0.6 
(HAPMAP R22 and HAPMAP Phase III). The p-value threshold 
for significant eQTL effects was set at 4xl0 -8 , a target a of 0.05 
with a Bonferroni correction for 1,177,728 tests (64 independent 
loci x 18,402 transcripts). 

To quantify the enrichment of eQTLs among the cardiac trait 
GWAS SNPs, we generated 100,000 randomized independent 
SNP sets of the same size as the number of independent GWAS 
loci, and with corresponding MAF distribution and proximity to 
genes. The number of nominally significant eQTL associations for 
the original independent GWAS loci is referred to as Q. Next, for 
each random set S ; , we determined the number of eQTLs at 
nominal significance (p^O.05), referred to as Q. The simulations 
yielded a fold-enrichment score, calculated as the average over all 
random sets of the ratio between Q and Q, and an empirical p- 
value, calculated as the proportion of simulations in which the 
number of eQTLs exceeds the number of nominally significant 
eQTL associations in the original independent GWAS loci. 

Public access to microarray data 

The microarray genotyping and gene expression data of the 
study have been deposited online at the Gene Expression 
Omnibus (GEO), with accession number GSE55232. 

Supporting Information 

Figure SI Manhattan distance hierarchical clustering 
dendogram of 154 genotyped subjects. Manhattan distance 
hierarchical clustering revealed several genotypic outiiers. The 
clustering was repeated using principal component analysis, 
identifying the same groups of outliers. 
(TIF) 
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Figure S2 Results of eQTL power analyses in relation to 
MAF and gene expression fold change. eQTL power 
analyses were performed for different minimum minor allele 
frequencies (0.05, 0.10, 0.15, 0.20, 0.30 and 0.40). The gene 
expression fold change is defined as log2 difference in gene 
expression observed per copy of the minor allele. In each analysis, 
for each log 2 fold change X, all eQTLs with an absolute log 2 fold 
change larger than X were considered, and the power was 
calculated as the percentage of those eQTLs for which the null 
hypothesis is rejected at FDR £0.05. 
(TIF) 

Table SI Table of all significant eQTLs. This table 
contains the complete results for all significant non-diseased 
human heart eQTLs (FDR <0.05). It contains for each SNP- 
transcript pair the SNP ID, gene or transcript IDs (HGNC, Entrez 
Gene, RefSeq), genomic locations, minor and major allele, minor 
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